综合智慧能源 ›› 2024, Vol. 46 ›› Issue (10): 32-39.doi: 10.3969/j.issn.2097-0706.2024.10.005

• Power Grid and Artificial Intelligence •

  • Biographies: MA Gang (1984), male, professor, Ph.D., engaged in research on renewable energy generation and grid integration, nnumg@njnu.edu.cn;
    MA Jian (2001), male, master's student, engaged in research on power system security and stability control;
    YAN Yunsong (1981), male, professor-level senior engineer, M.S., engaged in research on power system security and stability control;
    CHEN Yonghua (1979), male, professor-level senior engineer, engaged in research on power system security and stability control;
    LAI Yening (1975), male, professor-level senior engineer, Ph.D., engaged in research on power system security and stability control.

Decentralized voltage control of distribution network based on multi-agent reinforcement learning

MA Gang1,2(), MA Jian2, YAN Yunsong1, CHEN Yonghua1, LAI Yening1, LI Zhukun1, TANG Jing1   

  1. State Key Laboratory of Technology and Equipment for Defense Against Power System Operational Risks, NARI Technology Company Limited, Nanjing 211106, China
    2. School of Electrical and Automation Engineering,Nanjing Normal University, Nanjing 210023, China
  • Received:2024-07-15 Revised:2024-08-27 Accepted:2024-09-13 Published:2024-10-25
  • Supported by:
    Key Research and Development Program of Jiangsu Province (BK20232026); Project of the State Key Laboratory of Smart Grid Protection and Operation Control (SGNR0000KTTS2302147)


Abstract:

The integration of large-scale decentralized resources into distribution networks has changed the traditional power flow distribution, resulting in frequent voltage violations. Model-based voltage control methods require detailed knowledge of the network topology and have long solution times, making them unsuitable for real-time voltage control. To address this, a multi-agent online-learning strategy with asynchronous training is proposed for decentralized voltage control in distribution networks. The method treats each photovoltaic (PV) inverter as an agent. First, the agents are partitioned into zones and adjusted; the voltage/reactive power control problem of the distribution network is then modeled as a Markov decision process. Subject to the system's distributed constraints, a multi-agent reinforcement learning decentralized control framework is adopted, and the agents are trained with the multi-agent deep deterministic policy gradient (MADDPG) algorithm. Once trained, the agents can make decentralized decisions using only local information, without real-time communication, determining the output schedules of the PV inverters so as to achieve real-time voltage control and reduce network losses. Finally, the effectiveness and robustness of the method are verified through simulation.
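To illustrate the decentralized-execution idea described above, the sketch below shows how a trained per-inverter actor could map a purely local observation to a reactive-power setpoint without real-time communication. It is a minimal illustration, not the paper's implementation: the observation layout (local bus voltage and PV active power), network sizes, ratings, and the randomly initialized weights (which would come from MADDPG training) are all assumptions.

```python
import numpy as np

class PVInverterAgent:
    """Hypothetical deterministic actor for one PV-inverter agent."""

    def __init__(self, obs_dim=2, hidden=16, s_rating_kva=100.0, seed=0):
        rng = np.random.default_rng(seed)
        # Tiny two-layer actor network; in practice the weights are
        # the result of centralized MADDPG training.
        self.w1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)
        self.s_rating = s_rating_kva  # inverter apparent-power rating

    def act(self, obs, p_kw):
        """Local observation -> reactive-power setpoint (kvar)."""
        h = np.tanh(obs @ self.w1 + self.b1)
        a = np.tanh(h @ self.w2 + self.b2)[0]  # bounded action in [-1, 1]
        # Respect the inverter capability curve: |Q| <= sqrt(S^2 - P^2).
        q_max = np.sqrt(max(self.s_rating**2 - p_kw**2, 0.0))
        return float(a * q_max)

# Decentralized execution: each agent decides from its own bus data only.
agents = [PVInverterAgent(seed=i) for i in range(3)]
local_obs = [np.array([1.04, 80.0]),   # [bus voltage p.u., PV active power kW]
             np.array([0.97, 60.0]),
             np.array([1.01, 95.0])]
q_setpoints = [agent.act(obs, obs[1]) for agent, obs in zip(agents, local_obs)]
for i, q in enumerate(q_setpoints):
    print(f"agent {i}: Q setpoint = {q:.2f} kvar")
```

The capability-curve clipping keeps every commanded setpoint feasible for the inverter regardless of the learned policy's raw output, which is one common way to enforce hard device constraints outside the neural network.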

Key words: distribution network, multi-agent, decentralized voltage control, multi-agent deep deterministic policy gradient, Markov decision process

CLC number: